The efficacy of a specially constructed Gallager-type
error-correcting code for communication over a Gaussian
channel is examined. The construction is based on the
introduction of complex matrices, used in both encoding
and decoding, which comprise sub-matrices of cascading
connection values. Finite size effects are estimated in
order to compare the results with the bounds set by
Shannon. The critical noise level achieved for certain
code rates and infinitely large systems nearly saturates
Shannon's bounds even when the connectivity used is low.



Information is typically corrupted by noise during
transmission. Various strategies have been adopted for
reducing or eliminating the noise in the received
message. One of the main approaches is the use of
error-correcting codes, whereby the original message is
encoded prior to transmission in a manner that enables
the retrieval of the original message from the corrupted
transmission. The maximal transmission rate is bounded
by the channel capacity derived by Shannon [1] in his
groundbreaking work of 1948, which does not provide
specific constructions of optimal codes.

Various types of error-correcting codes have been
devised over the years (for a review see [2]) for improving
the transmission efficiency, although most of them still
fall short of Shannon's limit. We will concentrate here on
a member of the parity-check codes family introduced by
Gallager [3], termed the MN code [4], and on a specific
construction suggested by us previously [5] for the Binary
Symmetric Channel (BSC).

The connection between parity-check codes and statistical
physics was first pointed out in Ref. [6], by
mapping the decoding problem onto that of a particu- 
lar Ising-system with multi-spin interactions. The cor- 
responding Hamiltonian has been investigated in both 
fully-connected || and diluted systems for deriving 
the typical performance of these codes; more complex ar- 
chitectures, somewhat similar to those examined below 
have been investigated in ||, establishing the connec- 
tion between statistical physics and Gallager type codes. 
Most of these studies have been carried out for a par- 
ticular channel model, the BSC, whereby a fraction of 
the transmitted vector bits is flipped at random during 
transmission. 

However, different noise models may be considered for
simulating communication in various media. One of the
most commonly used noise models, and arguably the most
suitable one for a wide range of applications, is that
of additive Gaussian noise (usually termed Additive White
Gaussian Noise, AWGN, in the literature). In this scenario,
a message comprising N binary bits is transmitted through
a noisy communication channel; a certain power level is
used in transmitting the information, and we choose the
transmitted values to be ±1 for simplicity. The transmitted
message is then corrupted by additive Gaussian noise of
zero mean and variance $\sigma^2$; the received
(real-valued) message is then decoded to retrieve the
original message.

The receiver can correct the flipped bits only if the
source transmits M > N bits; the ratio between the original
number of bits and the number of transmitted bits,
R = N/M, constitutes the code rate for unbiased messages.
The channel capacity in the case of real-valued
transmissions corrupted by Gaussian noise, which provides
the bound on the maximal code rate $R_c$, is given
explicitly [10] by



$$R_c = \frac{1}{2} \log_2\left(1 + v^2/\sigma^2\right) \tag{1}$$



where $v^2$ is the power used for transmission (the
transmitted values are taken here to be ±1, so that
$v^2 = 1$) and $v^2/\sigma^2$ is therefore the
signal-to-noise ratio. However, we will focus here on binary source
messages; this reduces the maximal code rate to fnj 



$$R_c = -\int dy\, P(y) \log_2 P(y) + \int dy\, P(y|x = x_0) \log_2 P(y|x = x_0) \tag{2}$$

where $x$ is a transmitted bit (of value $x_0 = \pm 1$) and $y$
the received bit after corruption by additive Gaussian
noise, such that

$$P(y) = \frac{1}{\sqrt{8\pi\sigma^2}} \left[ e^{-(y - x_0)^2/(2\sigma^2)} + e^{-(y + x_0)^2/(2\sigma^2)} \right]$$
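The two capacity expressions, Eq. (1) and Eq. (2), can be compared numerically. The following is a minimal sketch, assuming capacities are measured in bits (base-2 logarithms) and unit transmission power; the function names, the integration range and the number of integration steps are illustrative choices and do not come from the text.

```python
import math


def gaussian_capacity(sigma, v2=1.0):
    """Eq. (1): capacity in bits per channel use for real-valued inputs."""
    return 0.5 * math.log2(1.0 + v2 / sigma ** 2)


def binary_input_capacity(sigma, x0=1.0, steps=20000):
    """Eq. (2) for inputs +x0/-x0, evaluated with a midpoint rule."""
    norm = math.sqrt(2.0 * math.pi * sigma ** 2)

    def conditional(y, x):
        # Gaussian density P(y | x) with variance sigma^2
        return math.exp(-(y - x) ** 2 / (2.0 * sigma ** 2)) / norm

    lo, hi = -x0 - 10.0 * sigma, x0 + 10.0 * sigma
    dy = (hi - lo) / steps
    output_entropy = 0.0
    for i in range(steps):
        y = lo + (i + 0.5) * dy
        p = 0.5 * (conditional(y, x0) + conditional(y, -x0))
        if p > 0.0:  # skip tails that underflow to zero
            output_entropy -= p * math.log2(p) * dy
    # The second integral of Eq. (2) is minus the entropy of the Gaussian
    # noise, for which the closed form is used here.
    noise_entropy = 0.5 * math.log2(2.0 * math.pi * math.e * sigma ** 2)
    return output_entropy - noise_entropy


for sigma in (0.5, 1.0, 1.5):
    print(sigma, gaussian_capacity(sigma), binary_input_capacity(sigma))
```

Restricting the input to binary values can only lower the capacity, so the second column never exceeds the first; the difference is largest at low noise, where the binary-input capacity cannot exceed one bit.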
The specific error-correcting code that we will use here
is a variation of the Gallager code [3]. It became popular
recently due to the excellent performance of its regular [4],
irregular [11]-[13] and cascading connection [5]
versions. In the original method, the transmitted
message comprises the original message itself and
additional bits, each of which is derived from the parity of
a sum of certain message-vector bits. The choice of the
message-vector elements used for generating single
codeword bits is carried out according to a predetermined
random set-up and may be represented by a product of
a randomly generated sparse matrix and the
message-vector in a manner explained below. Decoding the
received message relies on iterative probabilistic methods
like belief propagation or belief revision.
In the MN code one constructs two sparse matrices
A and B of dimensionalities M x N and M x M respectively.
The matrix A has K non-zero (unit) elements per row and
C (= KM/N) per column, while B has L per row/column.
The matrix $B^{-1}A$ is then used for encoding the message

$$t_B = B^{-1} A s \pmod 2 .$$
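The encoding step can be illustrated on a toy example. This is a minimal sketch under stated assumptions: the sizes N, M, K, the random seed, the bidiagonal toy matrix B and the mapping of Boolean 0/1 to +1/-1 are all illustrative choices, and the random A below fixes only the number of non-zero elements per row (the column connectivity C is not enforced). Instead of forming the inverse of B explicitly, the linear system B t = A s is solved over GF(2).

```python
import random


def matvec_mod2(matrix, vec):
    """Matrix-vector product over GF(2)."""
    return [sum(a & b for a, b in zip(row, vec)) % 2 for row in matrix]


def solve_mod2(B, rhs):
    """Solve B t = rhs over GF(2) by Gauss-Jordan elimination."""
    m = len(B)
    aug = [row[:] + [r] for row, r in zip(B, rhs)]
    for col in range(m):
        pivot = next((r for r in range(col, m) if aug[r][col]), None)
        if pivot is None:
            raise ValueError("B is singular over GF(2)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(m):
            if r != col and aug[r][col]:
                aug[r] = [x ^ y for x, y in zip(aug[r], aug[col])]
    return [row[m] for row in aug]


def random_sparse_rows(n_rows, n_cols, k, rng):
    """Random 0/1 matrix with exactly k unit elements per row."""
    rows = []
    for _ in range(n_rows):
        row = [0] * n_cols
        for j in rng.sample(range(n_cols), k):
            row[j] = 1
        rows.append(row)
    return rows


rng = random.Random(0)
N, M, K = 4, 8, 2                      # toy sizes, code rate R = N/M = 1/2
A = random_sparse_rows(M, N, K, rng)
# Toy invertible B: unit diagonal plus one sub-diagonal.
B = [[1 if (i == k or i == k + 1) else 0 for k in range(M)] for i in range(M)]

s = [rng.randint(0, 1) for _ in range(N)]      # message bits
t_B = solve_mod2(B, matvec_mod2(A, s))         # t_B = B^{-1} A s (mod 2)
t = [1 - 2 * bit for bit in t_B]               # assumed mapping 0 -> +1, 1 -> -1
print(s, t_B, t)
```

The defining property of the codeword is that B t_B and A s agree modulo 2, which is what the decoder later exploits.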

The Boolean message vector $t_B$ is then transmitted as a
vector t of real-valued elements, which we will choose for
simplicity as ±1, and is corrupted by a real-valued noise
vector $\nu$, where each element is sampled from a Gaussian
distribution of zero mean and variance $\sigma^2$. The received
message is of the form

$$r = t + \nu .$$

Using the noise model and the probability of the transmitted
bit being $t_\mu = \pm 1$:

$$P(t_\mu = \pm 1 | r_\mu) = \frac{1}{1 + e^{\mp 2 r_\mu/\sigma^2}} \tag{3}$$



one can easily convert the real-valued noise $\nu$ to a flip
noise vector n such that the probability of an error, $n_\mu = 1$,
is



P{n„ = 1) 



1 



(4) 



l + e" 



Note that $P(n_\mu = 1)$ may be larger than 1/2. The noise
vector n and our estimate for the transmitted vector t
are defined probabilistically by using the probabilities
derived in Eq. (4) and Eq. (3) respectively.
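The posterior of Eq. (3) follows from Bayes' rule applied to the two Gaussian likelihoods centred at +1 and -1, assuming equal prior probabilities for the two bit values. The sketch below checks the logistic form against the explicit likelihood ratio on a simulated channel; the function names, the noise level and the random seed are illustrative assumptions.

```python
import math
import random


def posterior_plus(r, sigma):
    """P(t = +1 | r) in the logistic form of Eq. (3)."""
    return 1.0 / (1.0 + math.exp(-2.0 * r / sigma ** 2))


def posterior_plus_bayes(r, sigma):
    """The same posterior written out with the two Gaussian likelihoods."""
    like_plus = math.exp(-(r - 1.0) ** 2 / (2.0 * sigma ** 2))
    like_minus = math.exp(-(r + 1.0) ** 2 / (2.0 * sigma ** 2))
    return like_plus / (like_plus + like_minus)


def awgn(t, sigma, rng):
    """Add zero-mean Gaussian noise of standard deviation sigma."""
    return [x + rng.gauss(0.0, sigma) for x in t]


rng = random.Random(1)
sigma = 0.9
t = [rng.choice((-1, 1)) for _ in range(8)]
r = awgn(t, sigma, rng)
for t_mu, r_mu in zip(t, r):
    print(t_mu, round(r_mu, 3), round(posterior_plus(r_mu, sigma), 3))
```

A received value of zero carries no information about the bit, so both posteriors equal 1/2 there; large positive (negative) received values push the posterior of +1 towards one (zero).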

Having an estimate for the transmitted vector t as well
as an estimate for the noise vector n, one decodes the
binary received message t by employing the matrix B to
obtain

$$z = B t = A s + B n .$$

This requires solving the equation

$$[A, B] \begin{bmatrix} s' \\ n' \end{bmatrix} = z \tag{5}$$

where s' and n' are the unknowns. This is carried out
here using methods of belief network decoding [4,14],
where pseudo-posterior probabilities, for the decoded
message bits being 0 or 1, are calculated by solving
iteratively a set of equations for the conditional
probabilities of the codeword bits given the decoded
message and vice versa. For exact details of the method
used and the equations themselves see [4]. Two differences
from the framework used in the case of a Binary Symmetric
Channel (BSC) should be noted: 1) The probabilities
of Eq. (4) and Eq. (3) may be used for defining the priors
for single components of the noise and signal vectors
respectively. 2) Initial conditions for the noise part of
the dynamics may also be derived using Eq. (4).
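The algebra behind Eq. (5) can be checked on a toy example: multiplying the corrupted binary vector by B removes the inverse of B used in encoding and leaves A s + B n. The sketch below only verifies this identity; it is not a belief propagation decoder. The sizes, the flip probability, the seed and the bidiagonal toy matrix B are illustrative assumptions.

```python
import random


def matvec_mod2(matrix, vec):
    """Matrix-vector product over GF(2)."""
    return [sum(a & b for a, b in zip(row, vec)) % 2 for row in matrix]


def xor(u, v):
    """Element-wise sum modulo 2."""
    return [a ^ b for a, b in zip(u, v)]


def sparse_row(n_cols, k, rng):
    """Random 0/1 row with exactly k unit elements."""
    ones = set(rng.sample(range(n_cols), k))
    return [1 if j in ones else 0 for j in range(n_cols)]


rng = random.Random(3)
N, M = 5, 10
A = [sparse_row(N, 2, rng) for _ in range(M)]
# Toy B with unit diagonal and one sub-diagonal, so B t = y is solved
# by forward substitution: t_i = y_i + t_{i-1} (mod 2).
B = [[1 if (i == k or i == k + 1) else 0 for k in range(M)] for i in range(M)]

s = [rng.randint(0, 1) for _ in range(N)]
y = matvec_mod2(A, s)
t_B = []
for i, y_i in enumerate(y):
    t_B.append(y_i ^ (t_B[i - 1] if i > 0 else 0))

n = [1 if rng.random() < 0.1 else 0 for _ in range(M)]   # flip noise
received = xor(t_B, n)

z = matvec_mod2(B, received)                 # z = B (t_B + n)
AB = [row_a + row_b for row_a, row_b in zip(A, B)]       # concatenated [A, B]
print(z == matvec_mod2(AB, s + n))
```

The true pair (s, n) is therefore one solution of Eq. (5); the decoder has to single it out among all solutions by using the priors on s' and n'.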

The key point in obtaining improved performance is
the construction of the matrices A and B. The original
MN code [4], as well as that of Gallager [3], advocated the
use of regular architectures with fixed column connectivity;
it was also suggested that fixed K values may be preferred.
Recent work in the area of irregular codes [11]-[13]
suggests that irregular codes have the potential of providing
superior performance which nearly saturates Shannon's
limit. These methods concentrate on different column
connectivities and use high K and C values (up to
50), which of course increase the complexity of the
algorithm and the decoding time required. Decoding delays
are a major consideration in most practical applications.

Our method uses the same structure as the MN codes
and builds on insight gained from the study of physical
systems with symmetric and asymmetric [16] multi-spin
interactions and from examining special cases of
Gallager's method [17]. Our previous studies for the binary
symmetric channel [5] suggest that a careful construction,
based on different K and L values for the sub-matrices
of A and B respectively, while keeping the connectivity
of each of the sub-matrices (and of the matrix as a
whole) as uniform as possible, will provide the best
results. The guidelines for this architecture are given
below and come from the mean-field calculations of Refs.
[5,17], showing that the choice of low K and L value
codes results in a large basin of attraction but imperfect
end-magnetisation, while codes with higher K and
L values can potentially saturate Shannon's bound but
suffer from a rapidly decreasing basin of attraction as K
and L increase. To exploit the advantages of both
architectures and obtain optimal performance, a cascading
method was suggested [5,17] whereby one constructs the
matrices A and B from sub-matrices of different K and
L values, such that lower values will drive the increase in
overlap between the decoded and the original messages to
a level that enables the higher connectivity sub-matrices
to come into play, allowing the system to converge to the
perfectly decoded message [17].

Optimising the trade-off between having a large basin
of attraction and improved end magnetisation can be
done straightforwardly [17] in the case of simple codes
[4] but is not very easy in general. Guidelines for
optimising the construction in the general case have been
provided in Ref. [5]; the key points include: 1) The first
sub-matrices are characterised by low K and L values
($\le 2$), while K values in subsequent sub-matrices are
chosen gradually higher, so as to support the correction
of faulty bits, and L = 1. 2) The number of non-zero
column elements is kept as uniform as possible (preferably
fixed). 3) To guarantee the inversion of the matrix B,
and since noise bits have no explicit correlation, we use
a patterned structure, $B_{i,k} = \delta_{i,k} + \delta_{i,k+5}$, for the
B-sub-matrices with L = 2 and $B_{i,k} = \delta_{i,k}$ for L = 1. 4) The
sub-matrix with the lowest K value, which dominates
the dynamics in the initial, low-magnetisation stage, has
to include some odd K values in order to break the
inversion symmetry; otherwise the two solutions with $m = \pm 1$
are equally attractive. This was also found to dramatically
improve the convergence times.
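The patterned structure of guideline 3 makes B trivially invertible over GF(2), because the matrix is lower triangular with a unit diagonal. The sketch below builds such a matrix and solves B t = y by forward substitution; it assumes that the shifted diagonal does not wrap around at the matrix boundary, and the matrix size and test vector are illustrative.

```python
def patterned_B(m, shift=5):
    """B[i][k] = delta(i, k) + delta(i, k + shift), without wrap-around."""
    return [[1 if (i == k or i == k + shift) else 0 for k in range(m)]
            for i in range(m)]


def solve_patterned(y, shift=5):
    """Solve B t = y (mod 2) in O(M) steps: t_i = y_i + t_{i-shift}."""
    t = []
    for i, y_i in enumerate(y):
        previous = t[i - shift] if i >= shift else 0
        t.append(y_i ^ previous)
    return t


def matvec_mod2(matrix, vec):
    """Matrix-vector product over GF(2)."""
    return [sum(a & b for a, b in zip(row, vec)) % 2 for row in matrix]


M = 12
B = patterned_B(M)
y = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0]
t = solve_patterned(y)
print(matvec_mod2(B, t) == y)
print([sum(row) for row in B])   # row connectivities
```

Under the no-wrap-around assumption the first five rows contain a single unit element and all remaining rows contain two, so the connectivity is L = 2 except at the boundary. A cyclic version of the same pattern would have exactly two unit elements in every row, which makes it singular over GF(2), since the sum of all its columns vanishes modulo 2.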

We will now focus on two specific architectures,
constructed for the cases of R = 1/2 and R = 1/4, for
demonstrating the exceptional performance obtained by
employing this method. In each of the cases we divided
the composed matrix [A|B] into several sub-matrices
characterised by specific K and L values as explained in
Table I; the dimensionalities of the full A and B matrices are
M x N and M x M respectively. Sub-matrix elements
were chosen at random (in matrix A) according to the
guidelines mentioned above. Encoding was carried out
straightforwardly by using the matrix $B^{-1}A$. The
corrupted messages were decoded using the set of recursive
equations of Ref. [4], using random initial conditions for
the signal, while the initial conditions for the noise vector
were obtained according to the noise and signal
probabilities, Eq. (4). The prior probabilities were chosen
according to Eqs. (3) and (4).

In each experiment, T blocks of N-bit unbiased messages
were sent through a Gaussian noisy channel of zero
mean and variance $\sigma^2$ (enforced exactly); the bit
error-rate, denoted $p_b$, was monitored. We performed between
$T = 10^4$ and $5 \times 10^4$ trial runs for each system size and noise
level, starting from different initial conditions. These
were averaged to obtain the mean bit error-rate and the
corresponding variance. In most of our experiments we
observed convergence after fewer than 100 iterations,
except very close to the critical noise level. The main
halting criterion we adopted relies on either obtaining a
solution to Eq. (5) or on the stationarity of the first N bits
(i.e., the decoded message) over a certain number of
iterations. One should also mention that the decoding
algorithm's complexity is O(N), as all matrices are sparse.
The inversion of the matrix B is carried out only once
and requires O(1) operations due to the structure chosen.

The constructions used for the matrices in these two
cases appear in Table I, as well as the maximal standard
deviation for which $p_b < 2 \times 10^{-5}$ for a given message
length N, the predicted maximal standard deviation $\sigma_c^\infty$
once finite size effects have been considered (discussed
below) and Shannon's maximal standard deviation $\sigma_c$
defined in Eq. (2). These results, as well as other results
reported here, could be improved upon by avoiding
matrices with small loops and by replacing the method of
belief propagation by belief revision (our random
construction of the matrix A even allows for small loops of
size one). It was shown that both improvements have
a significant impact on the performance of this type of
code [4,15]. With these improvements, the actual bit
error rate is expected to be typically lower than the reported
value of $p_b = 2 \times 10^{-5}$; however, as we have been limited
to about $T = 5 \times 10^4$ trials per noise value we can only
provide an upper bound to the actual error values.



To compare our results to those obtained by using
turbo codes |1| and in Ref. [13] we plotted in Fig. 1 the
two curves (dotted and dashed respectively), for $N = 10^3$
and $10^4$, against the results obtained using our cascading
connection method (filled triangles). It is clear from
the figure that results obtained using our method are
superior in all cases examined. Furthermore, from Table I,
one can conclude that the averaged connectivity C in the
case of R = 1/2 and 1/4 is 5 and 9 respectively for the
matrix A and 3/2 for the matrix B. Similarly, the
averaged K values for R = 1/2 and 1/4 are K = 5/2 and 9/4,
respectively. These numbers are much smaller than those
used in Refs. [12,13] and other irregular constructions.
Minimising K and C is of great interest to practitioners
since decoding delays are directly proportional to the K
and C values used [4].
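The quoted averages are consistent with the relation C = KM/N stated earlier for the matrix A, which can be written as C = K/R since R = N/M. A minimal check, using exact fractions (the function name is an illustrative choice):

```python
from fractions import Fraction


def column_connectivity(mean_k, rate):
    """Average column connectivity of A: C = K M / N = K / R."""
    return mean_k / rate


c_half = column_connectivity(Fraction(5, 2), Fraction(1, 2))
c_quarter = column_connectivity(Fraction(9, 4), Fraction(1, 4))
print(c_half, c_quarter)
```

Both values reproduce the averaged connectivities C = 5 and C = 9 quoted for R = 1/2 and R = 1/4 respectively.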

It is clear from Fig. 1 that the finite size effects are
significant in defining the code's performance. It is therefore
desirable to find the performance in the limit of infinite
messages, which is also assumed in deriving Shannon's
bound. We employ two main methods for studying the
finite size effects: a) The transition from perfect ($m(\sigma) = 1$)
to no retrieval ($m(\sigma) = 0$), as a function of the standard
deviation $\sigma$, is expected to become a step function (at
$\sigma_c^\infty$) as $N \to \infty$; therefore, if the percentage of perfectly
retrieved blocks in the sample, for a given standard
deviation $\sigma$, increases (decreases) with N one can deduce
that $\sigma < \sigma_c^\infty$ (or $\sigma > \sigma_c^\infty$). b) Convergence times near
criticality usually diverge as $1/(\sigma_c^\infty - \sigma)$; by monitoring
average convergence times for various $\sigma$ values and
extrapolating one may deduce the corresponding critical
standard deviation.
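Method b) can be sketched as a straight-line fit: if the convergence time diverges as $1/(\sigma_c^\infty - \sigma)$, then its inverse is linear in $\sigma$ and vanishes at the critical value. The example below uses synthetic convergence times generated from an assumed critical value of 0.95 and an assumed amplitude; these numbers are purely illustrative and are not measurements from the experiments, and the fitting procedure is one possible choice rather than the one necessarily used here.

```python
def fit_critical_sigma(sigmas, taus):
    """Least-squares line through (sigma, 1/tau); return its zero crossing."""
    xs = list(sigmas)
    ys = [1.0 / tau for tau in taus]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -intercept / slope


# Synthetic data only: tau = amplitude / (sigma_c - sigma).
assumed_sigma_c = 0.95
amplitude = 5.0
sigmas = [0.80, 0.84, 0.88, 0.92]
taus = [amplitude / (assumed_sigma_c - s) for s in sigmas]

estimate = fit_critical_sigma(sigmas, taus)
print(estimate)
```

With real data the inverse times are noisy and the linear regime holds only close to criticality, so the range of $\sigma$ values included in the fit matters.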

Both methods have been used in finding the critical
values for R = 1/2 and R = 1/4; the results obtained
appear in Table I. In Fig. 2 we demonstrate the two
methods: we ordered the samples obtained for R = 1/2,
$\sigma = 0.915, 0.935$ (dashed and solid lines respectively) and
N = 1000, 10000 (thin and thick lines respectively)
according to their magnetisation; results with higher
magnetisation appear on the left and the x axis was
normalised to represent fractions of the complete set of
trials. One can easily see that the fraction of perfectly
retrieved blocks increases with system size, indicating that
$\sigma < \sigma_c^\infty$. In the inset one finds log-log plots of the mean
convergence times $\tau$ for R = 1/2, 1/4 and N = 10000,
carried out on perfectly retrieved blocks with fewer than 3
error bits. The optimal fitting of expressions of the form
$\tau \propto 1/(\sigma_c^\infty - \sigma)$ provides another indication for the $\sigma_c^\infty$
values, which are consistent with those obtained by the
first method.
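The ordering used for the magnetisation profile can be sketched as follows, assuming that the magnetisation of a block is the normalised overlap between the original and decoded messages in the ±1 representation. The four decoded blocks below are made-up toy data, and the function names are illustrative.

```python
def magnetisation(sent, decoded):
    """Overlap m = (1/N) sum_i sent_i * decoded_i for +1/-1 vectors."""
    return sum(a * b for a, b in zip(sent, decoded)) / len(sent)


def sorted_profile(magnetisations):
    """Pairs (fraction of trials, m), with m in descending order."""
    ordered = sorted(magnetisations, reverse=True)
    total = len(ordered)
    return [((rank + 1) / total, m) for rank, m in enumerate(ordered)]


def perfect_fraction(magnetisations):
    """Fraction of blocks retrieved perfectly (m = 1)."""
    return sum(1 for m in magnetisations if m == 1.0) / len(magnetisations)


# Toy data: one original block and four hypothetical decoding outcomes.
sent = [1, -1, 1, 1]
trials = [[1, -1, 1, 1], [1, 1, 1, 1], [1, -1, 1, 1], [-1, 1, -1, -1]]
ms = [magnetisation(sent, decoded) for decoded in trials]

print(sorted_profile(ms))
print(perfect_fraction(ms))
```

Comparing the perfectly retrieved fraction at a fixed noise level for two system sizes then indicates on which side of the critical standard deviation that noise level lies.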

We end this presentation by discussing the main
differences between our method and those presented in Refs.
[11]-[13]. Firstly, our construction builds on sub-matrices
of different K and L values, keeping the connectivity in
each of the columns as uniform as possible; this equalises
the corrections received by the various bits while allowing
them to participate in different multi-spin interactions, so
as to provide contributions of different types throughout
the dynamics. In contrast, other irregular codes build
on the use of different column connectivities such that a
small number of bits, of high connectivity, will lead the
decoding process, gathering more corrected bits as the
decoding progresses. Secondly, Refs. [11]-[13] as well as
others point to the need for high multi-spin interactions
for achieving performance close to Shannon's bound; we
show here that low K, L and C values are sufficient for
near-optimal performance (in the case of R = 1/2 and
1/4 the averaged connectivities are C = 5 and 9
respectively for the matrix A and 3/2 for the matrix B),
allowing one to carry out the encoding and decoding tasks
significantly faster. Our work suggests that it is possible
to come very close to saturating Shannon's bound with 
finite connectivity, at least for the code rates considered 
here. It is plausible that operating close to R = 1 will 
require higher K, L values and may require infinite C or 
C values; this question is currently under investigation. 

We have shown that through a successive change in
the number of multi-spin interactions (K and L) one can
boost the performance of Gallager-type error-correcting
codes. The results obtained here for the case of additive
Gaussian noise suggest performance competitive with
similar state-of-the-art codes for finite N values; extending
the results to the case of infinitely large systems suggests
that the current code is less than 0.1 dB from saturating
the theoretical bounds set by Shannon. It would be
interesting to examine methods for improving the finite
size behaviour of this type of code; these would be of
great interest to practitioners.
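A gap expressed in dB can be translated into a ratio of noise standard deviations: at fixed signal power the signal-to-noise ratio scales as $1/\sigma^2$, so the gap between Shannon's $\sigma_c$ and the value tolerated by a code is $20 \log_{10}$ of their ratio. The sketch below is a minimal illustration; the function names are assumptions and no values from Table I are used.

```python
import math


def gap_db(sigma_shannon, sigma_code):
    """Gap in dB between two noise levels at fixed signal power."""
    return 20.0 * math.log10(sigma_shannon / sigma_code)


def sigma_ratio_for_gap(gap):
    """Ratio sigma_shannon / sigma_code corresponding to a gap in dB."""
    return 10.0 ** (gap / 20.0)


ratio = sigma_ratio_for_gap(0.1)
print(ratio)
print(gap_db(ratio, 1.0))
```

A gap of 0.1 dB thus corresponds to a critical standard deviation within roughly 1.2% of Shannon's value.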
TABLE I. The critical noise standard deviation and $\sigma_c^\infty$ obtained by employing our method for various code rates in
comparison to the maximal standard deviation $\sigma_c$ provided by Shannon's bound. Details of the specific architectures used and
their row/column connectivities are also provided.
FIG. 1. Bit-error rate $p_b$ as a function of the standard
deviation for a given code-rate R = 1/2 for systems of size
N = 1000, 10000 (right and left respectively). Our results for
each system size appear as black triangles, while results
obtained via the turbo code and in Ref. [13] for systems of similar
sizes appear as curves (dotted and dashed respectively).



FIG. 2. The block magnetisation profile for R = 1/2,
$\sigma = 0.915, 0.935$ (dashed and solid lines respectively) and
N = 1000, 10000 (thin and thick lines respectively), showing
the sample magnetisation m vs. the fraction of the complete
set of trials. A total of about 10000 trials were rearranged in
descending order according to their magnetisation values.
One can see that the fraction of perfectly retrieved blocks
increases with system size. Inset: log-log plots of mean
convergence times $\tau$ for N = 10000 and R = 1/2, 1/4 (white and
black triangles respectively). The $\sigma_c^\infty$ values were calculated
by fitting expressions of the form $\tau \propto 1/(\sigma_c^\infty - \sigma)$ through the
data.